image processing
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China (0.04)
Deep Fractional Fourier Transform
Existing deep learning-based computer vision methods usually operate in the spatial and frequency domains, which are two orthogonal \textbf{individual} perspectives for image processing.In this paper, we introduce a new spatial-frequency analysis tool, Fractional Fourier Transform (FRFT), to provide comprehensive \textbf{unified} spatial-frequency perspectives.The FRFT is a unified continuous spatial-frequency transform that simultaneously reflects an image's spatial and frequency representations, making it optimal for processing non-stationary image signals.We explore the properties of the FRFT for image processing and present a fast implementation of the 2D FRFT, which facilitates its widespread use.Based on these explorations, we introduce a simple yet effective operator, Multi-order FRactional Fourier Convolution (MFRFC), which exhibits the remarkable merits of processing images from more perspectives in the spatial-frequency plane. Our proposed MFRFC is a general and basic operator that can be easily integrated into various tasks for performance improvement.We experimentally evaluate the MFRFC on various computer vision tasks, including object detection, image classification, guided super-resolution, denoising, dehazing, deraining, and low-light enhancement. Our proposed MFRFC consistently outperforms baseline methods by significant margins across all tasks.
Life-IQA: Boosting Blind Image Quality Assessment through GCN-enhanced Layer Interaction and MoE-based Feature Decoupling
Tang, Long, Zhen, Guoquan, Hao, Jie, Zhang, Jianbo, Duan, Huiyu, Yuan, Liang, Zhai, Guangtao
Abstract--Blind image quality assessment (BIQA) plays a crucial role in evaluating and optimizing visual experience. Most existing BIQA approaches fuse shallow and deep features extracted from backbone networks, while overlooking the unequal contributions to quality prediction. Moreover, while various vision encoder backbones are widely adopted in BIQA, the effective quality decoding architectures remain underexplored. T o address these limitations, this paper investigates the contributions of shallow and deep features to BIQA, and proposes a effective quality feature decoding framework via GCN-enhanced l ayeri nteraction and MoE-based f eature de coupling, termed (Life-IQA). Specifically, the GCN-enhanced layer interaction module utilizes the GCN-enhanced deepest-layer features as query and the penultimate-layer features as key, value, then performs cross-attention to achieve feature interaction. Moreover, a MoE-based feature decoupling module is proposed to decouple fused representations though different experts specialized for specific distortion types or quality dimensions.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Vision (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
FusionFM: All-in-One Multi-Modal Image Fusion with Flow Matching
Zhu, Huayi, Shu, Xiu, Xiong, Youqiang, Liu, Qiao, Chen, Rui, Yuan, Di, Chang, Xiaojun, He, Zhenyu
Current multi-modal image fusion methods typically rely on task-specific models, leading to high training costs and limited scalability. While generative methods provide a unified modeling perspective, they often suffer from slow inference due to the complex sampling trajectories from noise to image. To address this, we formulate image fusion as a direct probabilistic transport from source modalities to the fused image distribution, leveraging the flow matching paradigm to improve sampling efficiency and structural consistency. To mitigate the lack of high-quality fused images for supervision, we collect fusion results from multiple state-of-the-art models as priors, and employ a task-aware selection function to select the most reliable pseudo-labels for each task. We further introduce a Fusion Refiner module that employs a divide-and-conquer strategy to systematically identify, decompose, and enhance degraded components in selected pseudo-labels. For multi-task scenarios, we integrate elastic weight consolidation and experience replay mechanisms to preserve cross-task performance and enhance continual learning ability from both parameter stability and memory retention perspectives. Our approach achieves competitive performance across diverse fusion tasks, while significantly improving sampling efficiency and maintaining a lightweight model design. The code will be available at: https://github.com/Ist-Zhy/FusionFM.
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
- Oceania > Australia (0.04)
- (3 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.89)
Deep Unfolded BM3D: Unrolling Non-local Collaborative Filtering into a Trainable Neural Network
Basim, Kerem, Unal, Mehmet Ozan, Ertas, Metin, Yildirim, Isa
Block-Matching and 3D Filtering (BM3D) exploits non-local self-similarity priors for denoising but relies on fixed parameters. Deep models such as U-Net are more flexible but often lack interpretability and fail to generalize across noise regimes. In this study, we propose Deep Unfolded BM3D (DU-BM3D), a hybrid framework that unrolls BM3D into a trainable architecture by replacing its fixed collaborative filtering with a learnable U-Net denoiser. This preserves BM3D's non-local structural prior while enabling end-to-end optimization. We evaluate DU-BM3D on low-dose CT (LDCT) denoising and show that it outperforms classic BM3D and standalone U-Net across simulated LDCT at different noise levels, yielding higher PSNR and SSIM, especially in high-noise conditions.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.06)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.06)
- North America > Canada > Alberta (0.04)
- Asia > China > Sichuan Province > Chengdu (0.04)
- Information Technology > Communications (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.97)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Variational Denoising Network: Toward Blind Noise Modeling and Removal
Zongsheng Yue, Hongwei Yong, Qian Zhao, Deyu Meng, Lei Zhang
On one hand, as other data-driven deep learning methods, our method, namely variational denoising network (VDN), can perform denoising efficiently due to its explicit form of posterior expression. On the other hand, VDN inherits the advantages of traditional model-driven approaches, especially the good generalization capability of generative models.
- Asia > Macao (0.04)
- North America > Canada > Alberta (0.04)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- (4 more...)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Poland (0.04)
- (2 more...)
- North America > United States > Illinois (0.04)
- South America > Peru > Loreto Department (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)